Abstract. In 1975, Chaitin introduced his celebrated Omega number, the halting probability of 
a universal Chaitin machine, a universal Turing machine with a prefix-free domain. The Omega 
number's bits are algorithmically random — there is no reason the bits should be the way they are, if 
we define "reason" to be a computable explanation smaller than the data itself. Since that time, only 
two explicit universal Chaitin machines have been proposed, both by Chaitin himself. 

Concrete algorithmic information theory involves the study of particular universal Turing machines, 
about which one can state theorems with specific numerical bounds, rather than including terms like 
O(1). We present several new tiny Chaitin machines (those with a prefix-free domain) suitable for 
the study of concrete algorithmic information theory. One of the machines, which we call Keraia, is 
a binary encoding of lambda calculus based on a curried lambda operator. Source code is included 
in the appendices. 

We also give an algorithm for restricting the domain of blank-endmarker machines to a prefix-free 
domain over an alphabet that does not include the endmarker; this allows one to take many universal 
Turing machines and construct universal Chaitin machines from them. 



1. Introduction 

In 1948, Shannon published his seminal paper on information theory [17]. In Shannon's model, there is 
a sender, a receiver, and a (probably noisy) channel, or pipe. The sender transmits strings of bits over the 
channel, and the receiver records them on arrival. To let the receiver know when to stop listening, the 
message must have a certain pre-arranged structure. The message may end with a binary equivalent of 
"over and out," or perhaps the length of the message is sent first. What matters is that as soon as the last 
bit is sent, the receiver can determine that it was the end of the message; such structures are known 
as instantaneous codes. 

Not all instantaneous codes are created equal: some can send the same information with far fewer bits 
than others. The sender and receiver can agree on a dictionary that associates common strings with short 
encodings and infrequent strings with longer ones. If the message is typical, this will save considerably 
on the message size. If the message is atypical, there's more information in it, more that is unexpected. 
The shortest possible message size is related to the amount of information the string contains relative to 
the chosen dictionary, the string's complexity. 

In the mid-1960s, Kolmogorov [15], Solomonoff [18], and Chaitin [5] independently proposed the 
idea of using programs to describe the complexity of strings. This gave birth to the field of algorithmic 
information theory (AIT) [7]. The Kolmogorov complexity $K_U(s)$ of a string $s$ is the length of the 
shortest program in the programming language $U$ whose output is $s$. Solomonoff proposed a weighted sum 
over the programs, but it was dominated by the shortest program, so his approach and Kolmogorov's are 
roughly equivalent. Chaitin added the restriction that the programs themselves must be codewords in an 
instantaneous code, giving rise to prefix-free AIT. In this model, complexities become true probabilities, 
and Shannon's information theory applies directly. 

Given a prefix-free domain, there is the natural distribution $P(x) = 2^{-|x|}$, where $|x|$ is the length of 
x in bits. One can then ask the question, "What is the probability, given this distribution of inputs, that a 
program will output some string and halt?" Chaitin discovered that the bits of this "halting probability", 
after an initial computable prefix, are pure information [10]: the length of the shortest program that 
computes the first $n$ bits of the halting probability and stops is at least $n - c$ bits long. To calculate 
one more bit, you have to add at least one more bit to your program; there is no description of the 
strings of bits shorter than the strings themselves, modulo some fixed constant. The bits also contain all 
the information about whether each program will halt or not; Chaitin called this halting probability an 
Omega number. 
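
As a rough illustration of the distribution $P(x) = 2^{-|x|}$, the following sketch sums the weights of a small, made-up list of programs that are assumed to halt. The function name `haltingMass` and the program list are hypothetical; for a real universal Chaitin machine the sum ranges over infinitely many programs, and a finite list of this kind only yields a lower bound.

```javascript
// Sum of 2^(-|p|) over a finite list of programs (bit strings) that are
// assumed to halt. For a prefix-free list the sum never exceeds 1, and it
// is a lower bound on the machine's halting probability.
function haltingMass(programs) {
  return programs.reduce((sum, p) => sum + Math.pow(2, -p.length), 0);
}

// Hypothetical example: three programs assumed to halt.
console.log(haltingMass(["0", "10", "110"])); // 0.875
```

Each newly discovered halting program can only raise this bound, which is why Omega can be approximated from below but not computed.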

With the advent of accessible computers, Chaitin proposed studying concrete AIT — theorems about 
specific programming languages, with positive integers as error terms rather than the phrase "some fixed 
constant." To study concrete prefix-free AIT, Chaitin proposed two universal languages, or machines: 
one was a variant of LISP [10]; the other an 11-instruction "register machine" [8]. We know of no other 
universal Chaitin machines in the literature. LISP is a complex language, and the register machine, while 
small, is far from minimal. To foster the study of concrete AIT, we propose a few new minimalist Chaitin 
machines. 

2. Definitions 

Fix an alphabet $\Sigma$. The set of all finite strings of elements of $\Sigma$ is denoted $\Sigma^*$. A Turing machine $M$ is 
a partial recursive function $M : \Sigma^* \to \Sigma^*$, where "partial" means that $M$ may be undefined on some 
inputs; this handles the cases where the program runs forever. 

A Turing machine $M$ is called a Chaitin machine if its domain 

$$\mathrm{dom}(M) = \{x \in \Sigma^* : M(x) \text{ halts}\}$$

is prefix-free, i.e. for all $x, y \in \mathrm{dom}(M)$, if $x$ is a prefix of $y$, then $x$ is identical to $y$. 
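
The prefix-free condition is easy to check for a finite set of strings. The sketch below is only an illustration for finite lists (the helper name `isPrefixFree` is ours, not part of the paper); the domain of a real Chaitin machine is in general infinite and undecidable.

```javascript
// Returns true when no word in the list is a proper prefix of another.
function isPrefixFree(words) {
  // Identical words are allowed, so duplicates are dropped first.
  const sorted = [...new Set(words)].sort();
  // After lexicographic sorting, a word that is a prefix of any later word
  // is also a prefix of its immediate successor, so neighbours suffice.
  for (let i = 0; i + 1 < sorted.length; i++) {
    if (sorted[i + 1].startsWith(sorted[i])) return false;
  }
  return true;
}

console.log(isPrefixFree(["0", "10", "110"])); // true
console.log(isPrefixFree(["0", "01"]));        // false
```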

For any $x \in \Sigma^*$ and Chaitin machine $M$, the program-size complexity of $x$ with respect to $M$ is 
defined as 

$$H_M(x) = \min\{|w| : M(w) = x\}.$$

A Chaitin machine $U$ is called universal with respect to a set of machines $S$ if $H_U(x) \leq H_M(x) + O(1)$ 
for any $x \in \Sigma^*$ and for any $M \in S$. 
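
To make the definition of program-size complexity concrete, here is a minimal sketch for a toy machine given as a finite lookup table. Both the table `toyMachine` and the function `programSize` are hypothetical illustrations; for a real machine the minimum ranges over all halting programs and is not computable in general.

```javascript
// A toy "machine": a finite table from programs (bit strings) to outputs.
// Its domain {0, 10, 110, 111} is prefix-free.
const toyMachine = new Map([
  ["0", "a"],
  ["10", "b"],
  ["110", "a"],
  ["111", "ab"],
]);

// H_M(x): the length of the shortest program w with M(w) = x,
// or Infinity when no program in the table outputs x.
function programSize(machine, x) {
  let best = Infinity;
  for (const [w, out] of machine) {
    if (out === x && w.length < best) best = w.length;
  }
  return best;
}

console.log(programSize(toyMachine, "a"));  // 1
console.log(programSize(toyMachine, "ab")); // 3
```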

It is helpful to consider a Chaitin machine in Shannon's original sender-pipe-receiver model. Borrow- 
ing terminology from concurrent programming, the pipe is a shared resource. The input to the machine is 
held by the sender, a producer. The sender tries to put its bits into the pipe; it blocks if there are more bits 
to send and the pipe is full. When there are no more bits to send, the sender halts. The Chaitin machine 
is the receiver, a consumer. From time to time it tries to get bits out of the pipe, and blocks if the pipe is 
empty. The entire computation is said to halt if the sender halts, the Chaitin machine halts, and the pipe 
is empty. Codewords are those inputs $x$ for which the computation halts, i.e. $x \in \mathrm{dom}(M)$. 

This model makes it easy to see why the domain of a Chaitin machine is prefix-free: any extension of 
a codeword would cause the sender to block, a condition called overflow; any prefix of it would cause the 
Chaitin machine to block, a condition called underflow. A universal Chaitin machine usually reads in a 
self-delimiting program description, the prefix, and then simulates that program acting on the remainder 
of the input. If the input does not contain a proper program description, the universal machine blocks; 
we call this condition a syntax error. 
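
The three outcomes (halt, underflow, overflow) can be mimicked in a few lines. This is a sketch under simplifying assumptions: blocking is reported as a status value instead of an actual hang, and the machine `echo`, the class `Underflow`, and the function `run` are hypothetical names chosen for the illustration.

```javascript
class Underflow extends Error {}

// Runs `machine` (a function that pulls symbols through `read`) on `input`
// and reports which of the outcomes described above occurs.
function run(machine, input) {
  let pos = 0;
  const read = () => {
    if (pos >= input.length) throw new Underflow(); // receiver would block
    return input[pos++];
  };
  try {
    const output = machine(read);
    // Leftover symbols mean the sender would block: overflow.
    return pos === input.length
      ? { status: "halt", output }
      : { status: "overflow" };
  } catch (e) {
    if (e instanceof Underflow) return { status: "underflow" };
    throw e;
  }
}

// Toy machine: reads a unary length prefix 0^n 1, then echoes n symbols.
function echo(read) {
  let n = 0;
  while (read() === "0") n++;
  let out = "";
  for (let i = 0; i < n; i++) out += read();
  return out;
}

console.log(run(echo, "00110").status);  // "halt"
console.log(run(echo, "0011").status);   // "underflow"
console.log(run(echo, "001100").status); // "overflow"
```

Only the input `00110` is a codeword of this toy machine; its proper prefixes underflow and its extensions overflow, so the domain is prefix-free.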

A blank-endmarker machine (BEM) $B$ is a Chaitin machine in which one symbol of $\Sigma$, the blank 
endmarker (hereafter denoted o), is reserved. All but the last symbol of the codeword are taken from the 
alphabet $\Sigma \setminus \{o\}$, and the codeword is terminated with o. $B$ will request input until it reads the symbol 
o, after which no more input will be requested. It is this blank endmarker that allows the codewords to 
be prefix-free: removing the endmarker will cause an underflow; any symbols following an endmarker 
will cause an overflow. 

We may construct a BEM $M'$ from an arbitrary Turing machine $M$ defined on the alphabet $\Sigma$ as 
follows. First, define the alphabet $\Sigma' = \Sigma \cup \{o\}$ over which $M'$ operates. Next, define 
$\mathrm{dom}(M') = \{xo \mid x \in \mathrm{dom}(M)\}$, i.e. we append the symbol o to each string on which $M$ halts. No prefix or extension 
of a codeword is in the domain of $M'$, since every codeword in the domain has exactly one o as the last 
symbol. 

A universal BEM is a BEM defined on the alphabet $\Sigma'$ that is universal with respect to all such BEMs. 



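Theorem 2.1. A universal BEM $B_U$ exists. 
Proof: 

Let $0^n$ denote the concatenation of $n$ symbols 0, and let $B_n$ denote the $n$th BEM in a fixed effective 
enumeration of the BEMs defined on $\Sigma'$. Then $B_U$ can read in the self-delimiting prefix $0^n 1$ 
and simulate $B_n$ on the remainder of the input. □ 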



Theorem 2.2. No BEM is universal with respect to all Chaitin machines defined on $\Sigma'$. 
Proof: 

For a BEM to be universal over a set S, it must be able to represent the domain of machines in S with 
only a constant increase in the length of the codewords. However, this is impossible. 

Consider the following Chaitin machine $C$: inputs are concatenations of a self-delimiting program 
$p$ and a string $x \in \Sigma'^*$. The program $p$, when executed, outputs the length of the string, $n = |x|$. The 
machine $C$ first reads $p$, then executes it to get $n$, and then reads $x$. Finally, $C$ outputs the string $x$. 

The number of $n$-symbol strings is $|\Sigma'|^n = (|\Sigma| + 1)^n$. On the other hand, the number of $(n+c)$-symbol 
strings available to a BEM is only $|\Sigma|^{n+c}$, because it may only use the symbol o once at the end of the 
codeword. Since $(|\Sigma| + 1)^n$ grows faster than $|\Sigma|^{n+c}$, there is no constant $c$ such that $|\Sigma|^{n+c} > (|\Sigma| + 1)^n$ 
for all $n$. □ 
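
The counting argument can be checked numerically. The sketch below (the function name `firstFailure` is ours) searches for the smallest $n$ at which $(|\Sigma| + 1)^n$ overtakes $|\Sigma|^{n+c}$, using exact integer arithmetic; it always terminates because $(1 + 1/|\Sigma|)^n$ is unbounded.

```javascript
// Smallest n >= 1 with (sigma + 1)^n > sigma^(n + c), computed exactly
// with BigInt. `sigma` is the size of the alphabet without the endmarker
// and `c` is the proposed constant overhead.
function firstFailure(sigma, c) {
  const s = BigInt(sigma);
  for (let n = 1; ; n++) {
    if ((s + 1n) ** BigInt(n) > s ** BigInt(n + c)) return n;
  }
}

console.log(firstFailure(2, 1)); // 2, since 3^2 = 9 > 2^3 = 8
console.log(firstFailure(2, 3)); // 6, since 3^6 = 729 > 2^9 = 512
```

Whatever constant $c$ is proposed, some string length defeats it, which is the content of the theorem.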

We define the relation "universal-with-respect-to" and denote it $\succeq$. Let $\Sigma_n = \{0, 1, \ldots, (n - 1)\}$ 
and $\Sigma'_n = \{0, 1, \ldots, (n - 1), o\}$. For all $n \geq 2$ we have the following: 

Theorem 2.3. Let $A_n$ be a Chaitin machine that is universal with respect to all Chaitin machines defined 
over $\Sigma_n$, $B_n$ be a BEM that is universal with respect to all BEMs defined over $\Sigma'_n$, and $C_n$ be a Chaitin 
machine that is universal with respect to all Chaitin machines defined over $\Sigma'_n$. Then 

1. $C_n \succeq B_n \succeq A_n$, but 

2. $B_n \not\succeq C_n$ 

3. and $A_n \not\succeq B_n$. 

Proof: 

The first part of (1) holds because BEMs are Chaitin machines and $C_n$ was chosen to be universal with 
respect to that set. 

The second part of (1) holds because $B_n$ can simulate $A'_n$, the BEM constructed from $A_n$: it reads a 
self-delimiting program for $A_n$ and begins simulating it. If the program requests an input and $B_n$ reads 
o, then $B_n$ loops forever, simulating the underflow condition. If the program halts, then $B_n$ reads one 
more symbol, $x$; if $x \neq o$, then $B_n$ loops forever, simulating overflow. If the program loops forever on 
its own, then so does $B_n$. Thus the domain of $B_n$ is the same as that of $A_n$ modulo the final o. 

Item (2) is Theorem 2.2. 

Finally, (3) holds because $A_n$ is not allowed to use o. It must use a self-delimiting description of the 
input string, and the shortest self-delimiting version of a string $x$ grows like $|x| + \log^* |x| + O(1)$ [3], 
which violates the error bound for universality. □ 

3. Some minimalist machines 

In this section, we review four languages that greatly influenced our designs, and point out why these are 
not universal Chaitin machines. 

3.1. Lambda calculus 

Lambda calculus formed the basis of Church's 1936 negative answer [11, 12] to Hilbert's Entschei- 
dungsproblem (decision problem): is there an algorithm for deciding whether first-order statements are 
universally valid? He showed first that stating the equivalence of two lambda terms was a first-order pred- 
icate, and then that there is no recursive (or computable) function that can compute whether two terms 
are equivalent. Turing independently proved the same result the same year [20], and when he heard 
of Church's result, was quickly able to show that his machines compute the same class of functions as 
Church's lambda terms. 

Everything in lambda calculus is a function; there are no built-in types, data structures, branching 
instructions, or constants, and the only operation is functional composition, or application. Functions 
take functions as input and return functions as output. To denote his functions, Church used a slightly 
different notation than most mathematicians are used to: rather than 

$$f(x, y, z) = (\text{definition of } f \text{ in terms of } x, y, z),$$

Church wrote* 

$$f = \lambda xyz.(\text{definition of } f \text{ in terms of } x, y, z).$$

Application is denoted by concatenation or parentheses, and is left-associative: $f = \lambda xy.y(xx)y$ is the 
same as $f(x, y) = (y(x(x)))(y)$. 

There is a universal basis consisting of the two functions (or combinators) 

$$S = \lambda xyz.xz(yz) \quad \text{and} \quad K = \lambda xy.x$$

That is, the function represented by a lambda term may also be represented by a combination of these 
combinators. For example, the identity function $I = SKK$: 

$$\begin{aligned}
SKKv &= (\lambda xyz.xz(yz))KKv \\
&= Kv(Kv) \\
&= (\lambda xy.x)v(Kv) \\
&= v
\end{aligned}$$
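
The same reduction can be replayed with curried Javascript functions. This is only a sketch: Javascript evaluates arguments eagerly, so it agrees with lambda calculus on terminating examples like this one but not on terms whose unused arguments diverge.

```javascript
// The S and K combinators as curried functions.
const S = x => y => z => x(z)(y(z));
const K = x => y => x;

// The identity combinator recovered from the basis.
const I = S(K)(K);

console.log(I("v")); // "v"
```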

In fact, there is an algorithm called lambda abstraction that reduces any lambda term to a combination 
of $S$, $K$, and $I$ combinators† and so eliminates the need for any variables. To abstract away all the 
variables in a term, begin with the innermost variable $v$ and apply the following rules, then repeat for the 
remaining variables. 

1. abstract(XY) = S(abstract(X))(abstract(Y)) 

2. abstract(v) = I 

3. if a term X does not depend on v, then abstract(X) = KX 

For example, the reverse-application combinator is $\lambda xy.yx$. Abstracting $\lambda y$ yields $\lambda x.SI(Kx)$; 
abstracting $\lambda x$ yields $S(K(SI))(S(KK)I)$. 
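
The three rules translate directly into code. In this sketch a term is either a string (a variable or combinator name) or a two-element array `[f, a]` standing for the application of `f` to `a`; this representation and the helper names `occurs`, `abstractVar`, and `show` are our own choices. Rule 3 is tried first, which is the reading that reproduces the worked example.

```javascript
// Does variable v occur in term t?
function occurs(v, t) {
  return Array.isArray(t) ? occurs(v, t[0]) || occurs(v, t[1]) : t === v;
}

// Abstract the variable v out of term t using the three rules.
function abstractVar(v, t) {
  if (!occurs(v, t)) return ["K", t]; // rule 3
  if (t === v) return "I";            // rule 2
  return [["S", abstractVar(v, t[0])], abstractVar(v, t[1])]; // rule 1
}

// Print a term with left-associative application and minimal parentheses.
function show(t) {
  if (!Array.isArray(t)) return t;
  const arg = Array.isArray(t[1]) ? "(" + show(t[1]) + ")" : show(t[1]);
  return show(t[0]) + arg;
}

// The reverse-application combinator: abstract y, then x, from the body yx.
const inner = abstractVar("y", ["y", "x"]);
console.log(show(inner));                   // SI(Kx)
console.log(show(abstractVar("x", inner))); // S(K(SI))(S(KK)I)
```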

A term is said to be in normal form if no reduction step applies to it, that is, if it contains no lambda 
abstraction applied to an argument. For 
example, $\lambda xy.y$ is a normal form, but $(\lambda xy.y)S$ gets reduced to the normal form $\lambda y.y$, and $(\lambda y.y)K$ 
gets reduced to the normal form $K$. Reducing a term to normal form is equivalent to a Turing machine 
reaching a halting state. Some lambda terms do not have normal forms; these correspond to computations 
that never finish. For example, the term $(\lambda x.xx)(\lambda x.xx)$ reduces to itself; it is the lambda-calculus 
equivalent of an infinite loop. 

The output of a lambda calculus computation is the normal form of the term, if it exists. Since normal 
forms can be enumerated à la Gödel, there are bijections from normal forms to natural numbers and to 
binary strings. For the purposes of this paper, where we define machines to be partial recursive functions 
from binary strings to binary strings, it is convenient to choose the latter. 

* Strictly speaking, even the equals operator is just syntactic sugar: lambda terms are anonymous functions. 
† The combinator $I$ is included merely for convenience; it can, of course, be replaced by $SKK$. 

Lambda calculus' alphabet consists of parentheses, the symbols for the lambda operator and the 
dot, and symbols for variables. Though one rarely needs more than twenty-six variables, the formalism 
allows for subscripts; therefore, digits for subscripts and an end-of-subscript marker are also included. 
Let k be the number of symbols in the alphabet. 

Since there are infinitely many Chaitin machines that halt on each string x, lambda calculus needs to 
be able to encode an arbitrary string with only a constant overhead in order to be a universal Chaitin 
machine. When a bit string increases by one symbol, the number of representable strings increases by a 
factor of k. However, because of the well-formedness requirement that parentheses balance in a lambda 
term, the number of codewords with one more symbol increases by a smaller factor. Any encoding of 
strings necessarily suffers from a slight expansion, and so lambda calculus fails to reach the constant 
overhead bound. 

There are more requirements for a well-formed lambda term that we are ignoring here. We perform 
an exact analysis of a less restrictive case where we have only one parenthesis symbol and one combinator 
in the next section; we'll see that it, too, fails to be a universal Chaitin machine. 



3.2. Iota 

Iota [1] is a minimalist language created by Chris Barker. The universal basis $\{S, K\}$ suffices to produce 
every lambda term, but it is not necessary. There are one-combinator bases, known as universal 
combinators. Iota is a very simple universal combinator, $\lambda f.fSK$, denoted 0. To make Iota unambiguous, 
there is a prefix operator, 1, for application. Valid programs are preorder traversals of full binary trees. 
In the tables that follow, brackets [·] denote taking the semantics of the argument. 



Syntax                  Semantics
$F \to 1\,F_0\,F_1$     $[F_0]([F_1])$
$F \to 0$               $\lambda f.fSK$



Fokker [14] proposed a different universal combinator, $\lambda f.fS(\lambda xyz.x)$, which is slightly larger, but 
recovers S and K with fewer applications. 

As in lambda calculus, the Iota term has a normal form if the program halts. As before, 
because normal forms are denumerable, we can make a bijection between binary strings and normal 
forms, ordering both lexically, and output the matching string. Another alternative, advocated by Ben 
Rudiak-Gould [16], is to restrict the output to a subset of normal forms, such as lists of booleans. Any 
program whose normal form is not in this subset is defined to output the empty string, while the list of 
booleans is converted directly to a binary string. 

We illustrate the execution of two simple Iota programs below: 

$$\begin{aligned}
100 &= (\lambda f.fSK)(\lambda f.fSK) \\
&= (\lambda f.fSK)SK \\
&= SSKK \\
&= SK(KK) \\
&= I
\end{aligned}$$

$$\begin{aligned}
1010100 &= 1010I \\
&= (\lambda f.fSK)((\lambda f.fSK)I) \\
&= (\lambda f.fSK)(ISK) \\
&= (\lambda f.fSK)(SK) \qquad (1) \\
&= SKSK \\
&= KK(SK) \\
&= K
\end{aligned}$$

Notice that at step (1) we performed an application within an internal branch. Iota is confluent: it 
does not matter in which order the applications are carried out, because there are no side-effects. 
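
The two reductions above can be reproduced with a small evaluator. This is a sketch under stated assumptions: the helper names `parseIota` and `runIota` are ours, combinators are represented as curried Javascript functions, and evaluation is eager, so programs whose lazy evaluation halts but whose eager evaluation diverges are out of scope.

```javascript
const S = x => y => z => x(z)(y(z));
const K = x => y => x;
const iota = f => f(S)(K); // the combinator denoted 0

// Parse one full-binary-tree traversal starting at position pos and
// return its value together with the position just past it.
function parseIota(bits, pos = 0) {
  if (pos >= bits.length) throw new SyntaxError("incomplete traversal");
  if (bits[pos] === "0") return { value: iota, next: pos + 1 };
  if (bits[pos] !== "1") throw new SyntaxError("unexpected symbol");
  const f = parseIota(bits, pos + 1); // left subtree: the function
  const a = parseIota(bits, f.next);  // right subtree: the argument
  return { value: f.value(a.value), next: a.next };
}

// Evaluate a complete Iota program; trailing bits are rejected.
function runIota(bits) {
  const result = parseIota(bits);
  if (result.next !== bits.length) throw new SyntaxError("trailing bits");
  return result.value;
}

console.log(runIota("100")("v") === "v"); // true: 100 behaves as I
console.log(runIota("1010100") === K);    // true: 1010100 reduces to K
```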

Iota is not quite a universal Chaitin machine because of the requirement that codewords be preorder 
traversals of a full binary tree. There are $C_n$ full binary trees with $n + 1$ leaves, where $C_n$ is the $n$th 
Catalan number, giving a codeword of length $2n + 1$; asymptotically, $C_n \sim \frac{4^n}{n^{3/2}\sqrt{\pi}}$. Thus, if we increase 
the length of a codeword by two bits, the number of representable strings only increases by $2 - \frac{3}{2}\lg\left(\frac{n+1}{n}\right)$ 
bits. It is asymptotically close to being a universal Chaitin machine, but doesn't quite make it. Again, 
any encoding of strings within Iota will necessarily suffer from a slight expansion, and will not satisfy 
the $O(1)$ error requirement. 
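
The shortfall can be observed directly from the Catalan numbers. The sketch below (function names are ours) computes $C_n$ exactly and reports how many bits of codeword count are gained when two bits are added to the codeword length; the exact ratio $C_{n+1}/C_n = 2(2n+1)/(n+2)$ is always below 4, so the gain is always below 2 bits.

```javascript
// The nth Catalan number, via C_{k+1} = C_k * 2(2k+1) / (k+2).
function catalan(n) {
  let c = 1n;
  for (let k = 0n; k < BigInt(n); k++) {
    c = (c * 2n * (2n * k + 1n)) / (k + 2n); // the division is exact
  }
  return c;
}

// log2(C_{n+1} / C_n): bits gained by lengthening codewords by two bits.
function bitsGained(n) {
  return Math.log2(Number(catalan(n + 1)) / Number(catalan(n)));
}

console.log(catalan(10));           // 16796n
console.log(bitsGained(10) < 2);    // true
console.log(bitsGained(100) < 2);   // true
```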

3.3. Zot 

Zot [2] is a continuized form of Iota. Here, 1 is a combinator rather than an operator; in Barker's words, 
it is treated "lexically" rather than "syncategorematically." The initial continuation is the trivial one, and 
the current continuation is applied to each combinator in turn. This allows the program to get access to 
each bit of input individually. It also makes Zot a nice Gödel numbering, since every blank-terminated 
binary string is a valid Zot codeword and every computable function is represented. 



Syntax          Semantics
$F \to F\,B$    $[F]([B])$
$F \to o$       $\lambda c.cI$
$B \to 0$       $\lambda c.c(\lambda f.fSK)$
$B \to 1$       $\lambda cL.L(\lambda lR.R(\lambda r.c(lr)))$
$B \to O$       $\lambda c.c(O)(P)$, where $O = \lambda abcde.KI$ and $P$ is an output monad.

Barker also includes an operator, which I have denoted $O$, that allows the program to interact with 
an output monad. It is not strictly necessary: we can ignore the operator and, like Iota, consider the 
normal form of the current continuation to be the output. 

Zot is a BEM, and therefore not a universal Chaitin machine. 

3.4. Binary lambda calculus 

Binary lambda calculus (BLC) [19] is a language created by John Tromp in response to Chaitin's claim 
[9] that "Lambda calculus is even simpler and more elegant than LISP, but it's unusable. Pure lambda 
calculus with combinators S and K, it's beautifully elegant, but you can't really run programs that way, 
they're too slow." Tromp noted that "There is however nothing intrinsic to $\lambda$ calculus or CL that is 
slow; only such choices as Church numerals for arithmetic can be said to be slow, but one is free to do 
arithmetic in binary rather than in unary," and proposed BLC specifically for studying concrete AIT. 

Rather than follow Church's original notation, Tromp used de Bruijn [13] notation, which eliminates 
the need to use the variable name in both the lambda prefix and the body of a term. Instead, $n$ refers 
to the variable bound by the $n$th enclosing $\lambda$. 



Syntax                    Semantics
$F \to 01\,F_0\,F_1$      $[F_0]([F_1])$
$F \to 00\,F$             $\lambda[F]$
$F \to 1^{n+1}0$          $n$
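
A parser for this grammar fits in a few lines. The sketch below follows the table as read here, in which the variable with index $n$ is written as $n + 1$ ones followed by a zero; `parseBLC` is a hypothetical helper, and terms are printed as strings in de Bruijn notation. The returned position marks where the self-delimiting term ends, so any remaining bits are available as input.

```javascript
// Parse one BLC term starting at pos; return its printed form and the
// position just past it.
function parseBLC(bits, pos = 0) {
  if (bits.startsWith("00", pos)) {            // abstraction
    const body = parseBLC(bits, pos + 2);
    return { term: "(λ " + body.term + ")", next: body.next };
  }
  if (bits.startsWith("01", pos)) {            // application
    const f = parseBLC(bits, pos + 2);
    const a = parseBLC(bits, f.next);
    return { term: "(" + f.term + " " + a.term + ")", next: a.next };
  }
  // variable: a run of ones closed by a zero
  let end = pos;
  while (bits[end] === "1") end++;
  if (end === pos || bits[end] !== "0") throw new SyntaxError("incomplete term");
  return { term: String(end - pos - 1), next: end + 1 };
}

console.log(parseBLC("0010").term);    // (λ 0), the identity
console.log(parseBLC("0000110").term); // (λ (λ 1)), the K combinator
console.log(parseBLC("001011").next);  // 4: the bits "11" remain as input
```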



Any remaining bits are converted to a nil-terminated list of combinators $K$ and $KI$ to which the 
program is applied. The list is constructed using the pairing combinator $P = \lambda xyz.zxy$, which has the 
property that $PXYK = X$ and $PXY(KI) = Y$. Thus $K$ and $KI$ behave like booleans with respect to 
$P$. 
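
The selection property of the pairing combinator is easy to confirm with curried Javascript functions. This is a minimal sketch; the strings "X" and "Y" are placeholders standing in for arbitrary terms.

```javascript
const K = x => y => x;
const I = x => x;
const KI = K(I);                 // selects the second of two arguments
const P = x => y => z => z(x)(y); // the pairing combinator

const pair = P("X")("Y");
console.log(pair(K));  // "X"
console.log(pair(KI)); // "Y"
```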

Tromp explicitly states that the normal form of the resulting BLC term is the output. 

Prefix-free BLC uses a rather nonstandard approach. Instead of defining a computer whose domain 
is prefix-free, Tromp redefines the way output is handled: a program $p$ is prefix-free if and only if 

$$U(p : z) = \langle x_p, z \rangle, \qquad (*)$$

where $p : z$ is a lambda term encoding a list whose first few members are the bits of the string $p$, followed by the 
tail $z$, where $z$ may be infinite. Since $z$ is arbitrary and potentially infinite, $U$ cannot output $z$ without 
processing all of the bits of $p$ and returning $z$ as part of the output. This guarantees that no prefix or 
extension of $p$ has the right form, and thus the set of such $p$ is prefix-free. 

Normal BLC is a BEM, and therefore not a universal Chaitin machine; prefix-free BLC is a BEM 
combined with the definition (*), but does not define a Chaitin machine whose domain matches the set 
{p | (*) holds}. 

We would like to construct universal Chaitin machines from universal BEMs. The first step is to get 
rid of the blank endmarker. In the next section, we describe an algorithm to extract the (possibly empty) 
subset of the domain of a BEM that does not depend on reading the blank endmarker to know when to 
halt. 

4. Eliminating the blank endmarker 

Some BEMs have the property that their behavior after the final read request does not depend on the 
result of that request. In a BEM B, the last request is always for the o symbol, but in these cases, the 
program does not need the o to know when to halt. We can use this property to define a Chaitin machine 
whose domain is a prefix-free set of programs such that if $x \in \mathrm{dom}(C)$, then $xo \in \mathrm{dom}(B)$. 
We construct C in the following way: 

1. C simulates B up to the point where the first read request is made; if no read requests are made, 
then C loops forever. After a read request, no read is actually performed; rather, 

2. C simulates the behavior of B for all the possibilities for that symbol up to the point where one of 
the branches is about to halt or make another read request. 

3. At that point, the read for the previous symbol is actually performed; if there are no more symbols, 
then C blocks; otherwise C selects the appropriate branch and abandons the other branches' 
simulations. 

4. C continues executing that branch until it halts or makes a read request. If the selected branch 
halts, then C halts (although if there are symbols remaining in the pipe, the sender will block and 
the computation will fail to halt). If the selected branch performs a read request, C goes to step 
(2). 

In this way, C only needs to simulate $|\Sigma|$ concurrent branches at a time, and if B does not need to 
read o to know when to halt, then that read is never actually performed by C. The domain of C is prefix 
free, since C would underflow on any string more than one symbol shorter than a codeword of B and 
overflow on any extension of a codeword of C. 

A Chaitin machine constructed in this way is universal if it can simulate either of the universal 
machines that Chaitin proposed with only an additive constant increase in the size of the input. The 
machine constructed in this way from BLC is universal. 

5. Church's lambda operator as a curried interpreter 

In this section, we take a small aside to introduce a concept used in one of the proposed Chaitin machines 
below, as well as to introduce a bit of new notation. 

Church's lambda operator can be seen as an interpreter. It reads three self-delimiting parameters from 
the input stream — var, body, and replacement — builds data structures representing the application of 
the functions and operators that those parameters describe, and performs alpha and beta reduction on 
the data structures, calling itself recursively. If the process of alpha- and beta-reduction reaches a point 
where it cannot continue, it decodes the data structure into the normal form of the lambda term and 
outputs it as a string of symbols. 

Since an interpreter is merely a function, we can curry it: the function $\lambda$ takes an input var and 
returns a new function $\lambda'$; likewise, the function $\lambda'$ takes a single input body and returns a function $\lambda''$; 
the function $\lambda''$ applied to the input replacement yields the normal form, if one exists. 

Since curried functions take exactly one input, the application operator becomes strictly binary, and 
the tree of applications is a full binary tree. Programs may be written as preorder traversals of the tree to 
avoid parentheses. We adopt the backtick (`) as a prefix application operator throughout the rest of the 
paper, which behaves identically to Iota's 1 operator; for example, the lambda term $S = \lambda xyz.xz(yz)$ 
will be written 

``λx ``λy ``λz ``xz`yz 

In Appendix A, we include the Javascript source code for a source-to-source filter from this dialect 
of lambda calculus to Javascript. It supports lazy evaluation and can be trivially modified to work with 
any eager language with first-class functions, such as Perl. 
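
To make the prefix notation of this section concrete, the following sketch converts a term written with the backtick operator back into fully parenthesized form. It is an illustration only: `toParens` is a hypothetical helper, every symbol other than the backtick is treated as a one-character leaf (including λ), and whitespace is ignored.

```javascript
// Convert a preorder (backtick-prefixed) term into parenthesized form.
function toParens(src) {
  const tokens = src.replace(/\s+/g, "").split("");
  let pos = 0;
  function term() {
    if (pos >= tokens.length) throw new SyntaxError("incomplete term");
    const t = tokens[pos++];
    if (t !== "`") return t;  // a leaf
    const f = term();         // left subtree: the function
    const a = term();         // right subtree: the argument
    return "(" + f + " " + a + ")";
  }
  const out = term();
  if (pos !== tokens.length) throw new SyntaxError("trailing symbols");
  return out;
}

console.log(toParens("``xz`yz")); // ((x z) (y z))
console.log(toParens("``λxx"));   // ((λ x) x)
```

Because every application is binary, the parenthesized form is a full binary tree, which is what allows the parentheses to be dropped in the first place.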

6. A very simple Chaitin machine 

Below, we present a Chaitin-universal combinator for use in Iota. Input requests R go via a monad that 
binds the requests together in lazy-evaluation order. R reads a bit and evaluates to K or `K I if the bit is 
0 or 1, respectively. 

The definition of the combinator is optimized, like Fokker's, for the number of applications to recover 
K, S, and R. Codewords are concatenations of programs (preorder traversals of the application tree) and 
(possibly empty) input. 

A = `K `K R 
B = `K `K `K `K `K `K `K K 
C = ``λx `x B 
0 = ``λx ````x C A `K I S 



100 = K 
10100 = S 
1010100 = R 
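
These three identities can be checked mechanically. The sketch below assumes the definitions above read as A = `K `K R, B = K applied seven times around K, C = λx.xB, and 0 = λx.xCA(KI)S; this reading is our reconstruction from the text. R is a placeholder function that is never called, since only the recovery of K, S, and R is being tested.

```javascript
const S = x => y => z => x(z)(y(z));
const K = x => y => x;
const I = x => x;
// Placeholder for the input-request operator; never invoked here.
const R = function R() { throw new Error("R needs an input bit"); };

const A = K(K(R));
const B = K(K(K(K(K(K(K(K))))))); // seven applications of K around K
const C = x => x(B);
const zero = x => x(C)(A)(K(I))(S); // the combinator denoted 0

console.log(zero(zero) === K); // true: 100 = K
console.log(zero(K) === S);    // true: 10100 = S
console.log(zero(S) === R);    // true: 1010100 = R
```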



1             first program to loop due to a syntax error
00            first to loop due to overflow                  = 0 with input 0
110101000     first to loop due to underflow                 = `R0 with no input
1101010000    first to halt with nonempty input              = `R0 with input 0 = `K0





This language is prefix-free. Prefixes of programs are not traversals of full binary trees, so they loop 
due to a syntax error. In any halting program, there will be a finite number of applications of the R 
operator, and thus a finite number of bits appended to the end of the program. The execution of any 
prefix of that codeword will block due to underflow, and any extension will block due to overflow. 



M. Stay /Very simple Chaitin machines for concrete AIT 






It is also universal with respect to all Chaitin machines defined over {0, 1}. Chaitin's universal 
machine can be simulated by this one with a program that reads in a parenthesis-balanced LISP S- 
expression and evaluates it: S and K are universal over the lambda terms, and R provides the bits for 
Chaitin's readBit function. 

7. Extending a universal combinator 

One may also construct a Chaitin-universal combinator from any universal combinator U by taking 

0 = ``Pair ``λx ``λy ``λz U R, 

where 

Pair = ``λx ``λy ``λz ``zxy. 

Programs written for U can be converted to use 0 by replacing U with 100. For example, taking 
Iota's combinator U = ``λf ``f S K, we have that 



1100100                = `UU          = I
11001100100            = `U`UU        = `S K
110011001100100        = `U`U`UU      = K
1100110011001100100    = `U`U`U`UU    = S
1011001100100          = `0`U`UU      = R
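
The first and last rows of this table can be verified with curried Javascript functions. The sketch assumes the extended combinator is 0 = Pair (λxyz.U) R with U taken to be Iota's combinator, as in the example; R is again a placeholder that is never called.

```javascript
const S = x => y => z => x(z)(y(z));
const K = x => y => x;
const U = f => f(S)(K); // Iota's universal combinator
const R = function R() { throw new Error("R needs an input bit"); };

const Pair = x => y => z => z(x)(y);
const zero = Pair(x => y => z => U)(R); // the extended combinator 0

console.log(zero(zero) === U);       // true: 100 recovers U
console.log(zero(U(U(U))) === R);    // true: `0 `U `U U recovers R
```

The second check works because `U `U U reduces to SK, and SK selects the second component of the pair, which is R.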



8. Keraia, continuized binary lambda calculus with a 6-bit UTM 

Keraia is a BEM that uses a straightforward encoding of the curried $\lambda$ introduced in section 5. We begin 
with three examples: 



I = o          110   0   0
    Interpret  ``λ   x   x

K = o          110   0   110   10100   0
    Interpret  ``λ   x   ``λ   y       x

S = o          110   10100   110   11000   110   0   11   10100   0   1   11000   0
    Interpret  ``λ   x       ``λ   y       ``λ   z   ``   x       z   `   y       z


The leftmost leaf represents the curried $\lambda$ function; the first right subtree var (if it exists) represents 
a variable; the second right subtree body (if it and var exist) represents the applications of curried 
lambda and variables; the third right subtree replacement (if it and the previous two exist) represents 
the replacement pattern. 

Keraia uses a greedy algorithm while marking variables: it traverses body marking occurrences of 
var, then recursively parses body to mark the rest of the variables. Next, it performs $\alpha$-reduction and 
$\beta$-reduction until body has reached normal form. Any remaining leaves are replaced by the 0 combinator. 
Finally, Keraia performs lambda-abstraction and returns a combinator. 

Syntax          Semantics
$F \to F\,B$    `[F] [B]
$F \to o$       [0]
$B \to 0$       ``λc `c Interpret
$B \to 1$       ``λc ``λA `A ``λa ``λB `B ``λb `c ``Pair a b



While Zot's 1 combinator applies the left branch to the right, Keraia's 1 merely Pairs the branches. 
Interpret reads in the data structure created by the Pairing, and then interprets it. The behavior of 
the function Interpret is only specified on combinators of the form constructed by applications of the 
combinators 0 and 1. 

Like Zot, Keraia has a very simple self-interpreting UTM, with the same meaning: 111000 is the 
encoding of "apply the identity operator to what follows." 

In Appendix B, we give Javascript source code for an implementation of Keraia that bootstraps off 
of the curried lambda dialect from section 5. 

9. Keraia as a universal Chaitin machine 

Rather than use a continuized set of combinators to get a universal Turing machine, we can get a universal 
Chaitin machine with a few small modifications. First, we treat 1 lexically, interpreting the first full 
binary tree traversal as the program description; the remaining bits are given to the sender to push through 
the pipe. Also, rather than replace remaining leaves with the 0 combinator in the last step, we replace 
them with the R operator. As always, syntax errors (incomplete tree traversals), overflow, and underflow 
cause the machine to loop indefinitely. 

For example, the codeword 111010010100110001 splits into a self-delimiting program (all the bits 
but the last) and the input bit 1 (the last bit). The execution proceeds as follows: 

11101001010011000 = ```λx `R x I 
                  = `R I 
                  = ``K I I      (R -> `K I upon reading the bit 1) 
                  = I 
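
The split of a codeword into program and input is determined by the first complete tree traversal. A minimal sketch of that step follows; `splitCodeword` is a hypothetical helper name, and the check that the input is consumed exactly (underflow and overflow) is left to the machine itself.

```javascript
// Split a bit string into the first full-binary-tree traversal (the
// program) and the remaining bits (the input). Returns null when the
// traversal is incomplete, i.e. on a syntax error.
function splitCodeword(bits) {
  let need = 1; // number of subtrees still to be read
  for (let pos = 0; pos < bits.length; pos++) {
    // A 1 is an application node needing two subtrees; a 0 is a leaf.
    need += bits[pos] === "1" ? 1 : -1;
    if (need === 0) {
      return { program: bits.slice(0, pos + 1), input: bits.slice(pos + 1) };
    }
  }
  return null;
}

console.log(splitCodeword("111010010100110001"));
// { program: '11101001010011000', input: '1' }
```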

This modification of Keraia is a universal Chaitin machine, since every Lisp S-expression has an 
equivalent lambda term that is directly encodable, and the R operator behaves identically to Chaitin's 
readBit operator. 

10. Conclusion 

Algorithmic information theory has much to say about both physics and philosophy. It would be nice 
to experiment with tiny concrete models, but until now, there were only two programming languages 
that were universal Chaitin machines. We have given examples of two new universal Chaitin machines, 
a modification to universal combinators that allows them to be Chaitin-universal, and an algorithm for 
constructing a Chaitin machine from a BEM by removing the blank endmarker and extracting the 
prefix-free subset of those words. 

The complexity of $n$ bits of a Chaitin Omega number is $n - c$; Calude et al. [4] computed the first 
64 bits of an Omega number, the halting probability of Chaitin's register machine. We know, now, that c 
for that machine is at least 64. The machines proposed above are ideally suited for similar computations. 

Chaitin published an exponential Diophantine equation with one parameter $n$ which has infinitely 
many solutions if and only if the $n$th bit of $\Omega_C$ (i.e. the halting probability of a particular universal 
Chaitin machine $C$) is one [8]. These machines should make it possible to produce a smaller instance. 

A. A lambda-calculus-to-Javascript source-to-source filter 

// The "Be Lazy" wrapper 
function _(x) {return function() {return x}} 

// Evaluate the parsed source 
function __(x){ 
    return eval("(function(){var x="+parse(x)+"; return x})()") 
} 

// The guts: translate a backtick-prefixed term into Javascript source 
function parse(x) { 
    // a term without an application is a bare name 
    if (x.indexOf('`')==-1) return x; 
    var pos=1, count, start, c; 

    // find left tree: '`' opens a subtree; ' ' and '~' close a leaf 
    for (count=0, start=pos; 
         count >= 0 && pos < x.length; 
         pos++) 
    { 
        c=x.charAt(pos); 
        count += (c=='`')?1:((c==' ' || c=='~')?-1:0); 
    } 
    var left = x.substring(1,pos); 

    // find right tree 
    for (count=0, start=pos; 
         count >= 0 && pos < x.length; 
         pos++) 
    { 
        c=x.charAt(pos); 
        count += (c=='`')?1:((c==' ' || c=='~')?-1:0); 
    } 
    var right = x.substring(start,pos); 

    // '`~v' is the curried lambda applied to the variable v 
    if (left.substring(0,2)=='`~') 
        return "function() {return function("+ 
            left.substring(2)+") {return "+parse(right)+"()}}"; 
    // ordinary application: force the left thunk, pass the right thunk 
    return "function() {return "+parse(left)+"()("+parse(right)+")}"; 
} 

S=__("``~x ``~y ``~z ``x z `y z"); 
K=__("``~x ``~y x"); 
I=__("``~x x"); 
omega=__("``~x `x x"); 
Omega=__("`omega omega"); 

// Calling syntax: 
//   K()(I)(Omega)(S) = S() 
// or 
//   __("```K I Omega S")() = S() 



B. An implementation of Keraia 

// Keraia's alphabet 
_0=__("``~c `c Keraia"); 

// This version of _1 uses Javascript strings 
// instead of the Pair combinator 
_1=__("``~c ``~L ````Bit L ``~c `c _(0) L " + 
      "``~l ``~R ````Bit R ``~c `c _(0) R ``~r `c ``Cat l r"); 

// Assembles the string to interpret 
// (each return stays on one line with its value, so that automatic 
// semicolon insertion cannot turn it into a bare "return;") 
Cat=function(){return function(x){return function(y){ 
    return "1"+x()+y(); 
}}}; 

// `Bit _0 = K, `Bit _1 = `K I 
Bit=__("``~b ``b `K `K K `K `K `K `K `K `K I"); 

Keraia=function(){ 
    return function(x){ 
        // translate a (marked) bit string into the curried lambda dialect 
        function parse(x) { 
            // a string without 1s is a single leaf or a marked variable 
            if (x.indexOf('1')==-1) return "_"+x+" "; 
            var pos=1, count, start, c; 

            // find left tree ('1' and its marked form '3' open a subtree) 
            for (count=0; count >= 0 && pos < x.length; pos++) 
            { c=x.charAt(pos); count += (c=='1' || c=='3')?1:-1; } 
            var left = x.substring(1,pos); 

            // find right tree 
            for (count=0, start=pos; 
                 count >= 0 && pos < x.length; 
                 pos++) 
            { c=x.charAt(pos); count += (c=='1' || c=='3')?1:-1; } 
            var right = x.substring(start,pos); 

            // '10'+var is the curried lambda applied to var; mark the 
            // occurrences of var in the body by rewriting 0->2 and 1->3 
            if (left.substring(0,2)=='10') 
            { 
                var arg=left.substring(2). 
                    replace(/0/g,'2'). 
                    replace(/1/g,'3'); 
                return "``~_"+arg+" "+ 
                    parse(right.replace( 
                        new RegExp(left.substring(2),'g'),arg)); 
            } 
            return "`"+parse(left)+parse(right); 
        } 
        return __(parse(x()) 
            // uncomment this line for the prefix-free version 
            // .replace(/_0/g,'R') 
        )(); 
    } 
} 

epsilon=_0; 

I_k = __("`````epsilon _1 _1 _0 _0 _0");            // `` ~x x 

K_k = _( epsilon()(_1)(_1)(_0)(_1)(_0)(_1)(_0)(_0)  // `` ~x 
                  (_1)(_1)(_0)(_0)                  // `` ~y 
                  (_1)(_0)(_1)(_0)(_0) );           // x 

// These functions are only used in the prefix-free version 

InputBits=''; 

// Input monad 
R=function(){ 
    return function(x){ 
        // loop on underflow 
        if (InputBits=='') Omega(); 
        var c=InputBits.charAt(0); 
        InputBits=InputBits.substring(1); 
        return c=='1'?K()(I)(x):K()(x); 
    } 
} 

// (named pf_Keraia because a hyphen is not allowed in a Javascript identifier) 
pf_Keraia=function(x) { 
    var pos=0, count, c; 

    // find a full tree 
    for (count=0; count >= 0 && pos < x.length; pos++) 
    { c=x.charAt(pos); count += (c=='1' || c=='3')?1:-1; } 

    // loop on syntax error 
    if (count>-1) Omega(); 

    // Take the remainder of the bits as input 
    InputBits = x.substring(pos); 

    var combo = _(Keraia()(_(x.substring(0,pos)))); 

    // loop on overflow 
    if (InputBits != '') Omega(); 

    return combo; 
} 

// pf_Keraia("1001") = I() 



